Tempe: An Interactive Data Science Environment for Exploration of Temporal and Streaming Data
Abstract
Over the last two decades, data scientists have performed increasingly sophisticated analyses on larger data sets, yet their tools and workflows remain low-level. A typical analysis involves different tools for different stages of the work, requiring file transfers and considerable care to keep everything organized. Temporal data adds further complexity: users typically must write queries offline before porting them to production systems. To address these problems, this paper introduces Tempe, a web application providing an integrated, collaborative environment for both real-time and offline temporal data analysis. Tempe's central concept is a persistent research notebook that retains data sources, analysis steps, and results. Analysis steps are carried out in a script editor that uses a live programming approach to display interactive, progressively updated visualizations. Tempe uses a temporal streaming engine, Trill [17], as its backend data processor. In the process of creating Tempe, we discovered new interactivity and responsiveness requirements for Trill. Conversely, building around Trill has shaped the user experience of Tempe. We report on this cross-disciplinary design process to argue that end-user experience can be an integral part of creating a data engine.
Similar resources
An Improvement in Temporal Resolution of Seismic Data Using Logarithmic Time-frequency Transform Method
The improvement in the temporal resolution of seismic data is a critical issue in hydrocarbon exploration. It is important for obtaining more detailed structural and stratigraphic information. Many methods have been introduced to improve the vertical resolution of reflection seismic data. Each method has advantages and disadvantages which are due to the assumptions and theories governing their ...
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in dataset that is currently being produced at high speed, large volumes and with a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing the huge amount of data, today has become one of the concerns of data science scholar...
A Method to Reduce Effects of Packet Loss in Video Streaming Using Multiple Description Coding
Multiple description (MD) coding has evolved as a promising technique for promoting error resiliency of multimedia system in real-time application programs over error-prone communicational channels. Although multiple description lattice vector quantization (MDCLVQ) is an efficient method for transmitting reliable data in the context of potential error channels, this method doesn’t consider disc...
Video Visual Analytics of Tracked Moving Objects
Exploring video data by simply watching does not scale for large databases. Especially, this problem becomes obvious in the field of video surveillance. Motivated by a mini challenge of the contest of the IEEE Symposium on Visual Analytics Science and Technology 2009 (Detecting the encounter of persons in a provided video stream utilizing the techniques of visual analytics), we propose an appro...
Selection of new exploration targets using lithogeochemical data obtained for Taknar deposit located in NE of Iran
Taknar deposit is located about 28 km to the north-west of Bardaskan in the Khorasan-e-Razavi province, which is situated in the north-eastern part of Iran. This deposit is unique, formed within the Taknar formation in the Ordovician time. As a result, it is of much interest to many researchers working in this field. By choosing the lithogeochemical study performed to recognize new exploration ...
Publication date: 2014